Rough Set-based Dimensionality Reduction for Supervised and Unsupervised Learning

Authors

  • Qiang SHEN
  • Alexios CHOUCHOULAS
Abstract

The curse of dimensionality is a damning factor for numerous potentially powerful machine learning techniques. Widely accepted and otherwise elegant methodologies, used for tasks ranging from classification to function approximation, exhibit relatively high computational complexity with respect to dimensionality. This severely limits the applicability of such techniques to real-world problems. Rough set theory is a formal methodology that can be employed to reduce the dimensionality of datasets as a preprocessing step before training a learning system on the data. This paper investigates the utility of the Rough Set Attribute Reduction (RSAR) technique for both supervised and unsupervised learning, in an effort to probe RSAR's generality. FuREAP, a Fuzzy-Rough Estimator of Algae Populations, which is an existing integration of RSAR and a fuzzy Rule Induction Algorithm (RIA), is used as an example of a supervised learning system with dimensionality reduction capabilities. A similar framework integrating the Multivariate Adaptive Regression Splines (MARS) approach and RSAR is taken to represent unsupervised learning systems. The paper describes the three techniques in question, discusses how RSAR can be employed with a supervised or an unsupervised system, and uses experimental results to draw conclusions on the relative success of the two integration efforts.
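
To make the idea of rough-set attribute reduction concrete, the following is a minimal sketch of a greedy, dependency-based attribute search of the kind often called QuickReduct in the rough-set literature. The abstract does not spell out RSAR's algorithm, so this should be read as an illustration under assumptions rather than as the authors' implementation: the function names (`partition`, `dependency`, `greedy_reduct`) and the toy dataset are hypothetical, and all attributes are assumed to be discrete.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Dependency degree: fraction of rows whose equivalence class
    (with respect to attrs) carries a single decision label."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until it matches the dependency degree of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max(
            (a for a in all_attrs if a not in chosen),
            key=lambda a: dependency(rows, labels, chosen + [a]),
        )
        chosen.append(best)
    return sorted(chosen)


# Hypothetical toy data: attribute 2 duplicates attribute 0, so it is redundant.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
labels = ["A", "B", "B", "B"]

print(greedy_reduct(rows, labels))  # prints [0, 1]
```

In this toy dataset no single attribute determines the label, but attributes 0 and 1 together do, so the redundant third column is dropped. A greedy search of this kind returns a small attribute subset that preserves the dependency degree, though not necessarily the smallest one.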

Similar resources

Generalization Bounds for Supervised Dimensionality Reduction

We introduce and study the learning scenario of supervised dimensionality reduction, which couples dimensionality reduction and a subsequent supervised learning step. We present new generalization bounds for this scenario based on a careful analysis of the empirical Rademacher complexity of the relevant hypothesis set. In particular, we show an upper bound on the Rademacher complexity that is i...

Hybrid Attribute Reduction for Classification Based on A Fuzzy Rough Set Technique

Data usually exist in hybrid formats in real-world applications, and a unified data reduction method for hybrid data is desirable. In this paper a unified information measure is proposed for computing the discernibility power of a crisp equivalence relation and a fuzzy one, which is the key concept in the classical rough set model and the fuzzy rough set model. Based on the information measure, a general defini...

Information-preserving hybrid data reduction based on fuzzy-rough techniques

Data reduction plays an important role in machine learning and pattern recognition with high-dimensional data. In real-world applications data usually exist in hybrid formats, and a unified data reduction technique for hybrid data is desirable. In this paper, an information measure is proposed for computing the discernibility power of a crisp equivalence relation or a fuzzy one, which is the key...

Competitive Learning Algorithms in Adaptive Educational Toys

Unsupervised neural learning is typically employed in dimensionality reduction, to extract relevant features for subsequent stages of supervised learning. In this paper we examine a class of unsupervised learning algorithms used for a somewhat different purpose, that of clustering input vectors into various learned stereotyped behaviours in mobile robots [1]. Unsupervised techniques have signi...

Foundations of Coupled Nonlinear Dimensionality Reduction

In this paper we introduce and analyze the learning scenario of coupled nonlinear dimensionality reduction, which combines two major steps of the machine learning pipeline: projection onto a manifold and subsequent supervised learning. First, we present new generalization bounds for this scenario and, second, we introduce an algorithm that follows from these bounds. The generalization error bound i...

Publication date: 2001